ABSTRACT

We give a systematic, abstract formulation of the image normalization
method as applied to a general group of image transformations, and
then illustrate the abstract analysis by applying it to the hierarchy of
viewing transformations of a planar object.
1. Introduction and Brief Review of Viewing
Transformations of a Planar Object

A central issue in pattern recognition is the efficient incorporation of invariances with 
respect to geometric viewing transformations. We focus in this article on a particular method 
for handling invariances, called "image normalization", which has the capability of extracting
all of the invariant features from an image using only a small amount of information about 
the image (such as a few low order moments). The great appeal of normalization is that 
it isolates the problem of finding the image modulo the effect of viewing transformations, 
from the higher order problem of deciding which features of the image are needed for a 
specific classification decision. Intuitively, normalization is simply a systematic method 
for transforming from observer-based to image-based coordinates; in the former the image 
depends on the view, whereas in the latter the image is viewing transformation independent. 
From a mathematical viewpoint, our method consists of placing a set of constraints on the 
transformed image equal in number to the number of viewing transformation parameters, 
permitting one to solve either algebraically or numerically for the parameters of a normalizing 
transformation. Since the constraints are necessarily viewing transformation noninvariants, 
their construction is in general simpler than the direct construction of viewing transformation 
invariants. 

Let us begin our discussion with a quick review of the viewing transformations of a 
planar object, since these transformations will be used as illustrations of our general methods. 
(For further details, and a bibliography, see the excellent recent book of Reiss [15].) Under 
rigid 3D motions the image I(x), with x = (x_1, x_2) the two dimensional coordinate in
the image plane, is transformed to I(x'), with x' related to x by the planar projective
transformation

    x'_n = \frac{\sum_{m=1}^{2} G_{nm} x_m + t_n}{1 + \sum_{m=1}^{2} p_m x_m} , \quad n = 1, 2 .   (1)

When the depth of the object is much less than its distance from the lens, then the parameter
p_n in Eq. (1) can be neglected, and Eq. (1) reduces to the linear affine transformation

    x'_n = \sum_{m=1}^{2} G_{nm} x_m + t_n .   (2)



[An affine transformation, with G_{nm} replaced by G_{nm} - t_n p_m, also results when Eq. (1)
is expanded in a power series in x_m and second and higher order terms are neglected.]
Additionally, when the viewed object is constrained to lie in the plane normal to the viewing
or 3 axis, Eq. (2) specializes further to the similarity transformation group of scalings,
rotations, and translations, in which G_{nm} is simply a multiple (the scale factor) of a two
dimensional rotation matrix. The projective transformations, the affine transformations, and
the similarity transformations all form groups, and this will be the characterizing feature of
the viewing transformations studied in our general analysis.

In applications, it will be convenient to use subgroup factorizations, which are readily 
obtained from the group multiplication rule for the transformations of Eqs. (1) and (2). For 
example, a general planar projective transformation can be written as the result of composing 
what we will term a restricted projective transformation 

    x'_n = \frac{x_n}{1 + \sum_{m=1}^{2} p_m x_m}   (3)

with the general affine transformation of Eq. (2). Another subgroup factorization expresses
the general affine transformation of Eq. (2) as the result of the composition of a pure
translation

    x'_n = x_n + t_n   (4a)

with a homogeneous affine transformation

    x'_n = \sum_{m=1}^{2} G_{nm} x_m .   (4b)

Yet a third subgroup factorization expresses a general homogeneous affine transformation
as the result of composing what we will term a restricted affine transformation, which has
vanishing upper right off-diagonal matrix element,

    x'_n = \sum_{m=1}^{2} g_{nm} x_m , \quad g_{12} = 0 ,   (5a)

with a pure rotation

    x'_n = \sum_{m=1}^{2} R_{nm} x_m , \quad R_{11} = R_{22} = \cos\theta , \quad R_{12} = -R_{21} = -\sin\theta .   (5b)

A variant of Eqs. (5a-b) is obtained by requiring that the matrix g have unit determinant,
so that it has the two-parameter form g_{11} = u, g_{12} = 0, g_{21} = w, g_{22} = u^{-1}, and then
including a scale factor \lambda in Eq. (5b), which now reads

    x'_n = \lambda \sum_{m=1}^{2} R_{nm} x_m .   (5c)
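
As an illustration of the factorization of Eqs. (5a-b), the following minimal sketch splits a given 2x2 matrix G into a restricted affine factor g (with g_12 = 0) and a rotation R, so that G = g R. The function name `factor_homogeneous_affine` and the sample matrix are assumptions introduced here for illustration; the rotation angle is chosen so that the upper right element of G R^T vanishes.

```python
import numpy as np

def factor_homogeneous_affine(G):
    """Split a 2x2 matrix G as G = g @ R, with g[0, 1] = 0 and R a rotation.

    Hypothetical helper: theta is fixed by requiring (G @ R.T)[0, 1] = 0.
    """
    theta = np.arctan2(-G[0, 1], G[0, 0])
    c, s = np.cos(theta), np.sin(theta)
    # Rotation with R_11 = R_22 = cos(theta), R_12 = -R_21 = -sin(theta), as in Eq. (5b).
    R = np.array([[c, -s], [s, c]])
    g = G @ R.T  # R is orthogonal, so g @ R reproduces G
    return g, R, theta

# Sample matrix (an arbitrary choice for the demonstration).
G = np.array([[1.2, -0.7], [0.4, 0.9]])
g, R, theta = factor_homogeneous_affine(G)
```

With this choice of angle, g_11 equals the Euclidean length of the first row of G, so it is positive whenever that row is nonzero.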

2. General Theory of Image Normalization 

We proceed now to formulate a general framework for image normalization, with
the aim of understanding the common elements of the various normalization methods
which appear in the literature and of generalizing them to new applications. As a
preliminary to the mathematical discussion of Subsecs. 2A-E, we specify our notation for viewing
transformations. Let Q = {S} be a group of symmetry or viewing transformations S, which
act on the image I(x) according to

    I(x) \to I_S(x) = I(S(x)) .   (6a)
Our notational convention, that we shall adhere to throughout, is that x' = S(x) is the
concrete image coordinate mapping induced by the abstract group element S. [A specific
example of such a transformation would be the planar projection transformation of Eq. (1),
in which S would be the abstract element of the planar projective group characterized by
the parameters G_{mn}, t_n, p_m specifying the concrete coordinate mapping.] In this notation,
the result of successive transformations with S_1 followed by S_2 is given by

    I(x) \to I_{S_2 S_1}(x) = I(S_2(S_1(x))) .   (6b)

The transformation groups of interest to us are in general ones with continuous
parameters, in other words, Lie groups, and the reader interested in more background on Lie
group theory may wish to consult the texts of Gilmore [8] and Sattinger and Weaver [13].
However, very little of the formal apparatus of Lie group theory is required in what follows;
basically, all we use is the group closure property and the enumeration of the number of
group parameters. In particular, no knowledge of the representation theory of Lie groups is
needed.

A. The normalization recipe. We begin by giving the general prescription for an image
normalization transformation. Let N_I(x) be a transformation of x which depends on the
image I, and which is constructed so that under the image transformation of Eq. (6a), it
behaves as

    N_{I_S}(x) = S^{-1}(N_I(x)) ,   (7a)

with S^{-1} the inverse transformation to S of Eq. (6a),

    S(S^{-1}(x)) = x .   (7b)
Also, let M_I(x) be an optional second transformation of x which depends on the image I
only through invariants under the group of transformations Q, that is,

    M_{I_S}(x) = M_I(x) , \quad all S \in Q .   (7c)

Then

    \tilde{I}(x) = I(N_I(M_I(x)))   (8)

is a normalized image which is invariant under all transformations of the group Q. This is
an immediate consequence of Eq. (6a) and Eqs. (7a-c), from which we have

    \tilde{I}_S(x) = I_S(N_{I_S}(M_{I_S}(x)))
                   = I(S(S^{-1}(N_I(M_I(x)))))   (9)
                   = I(N_I(M_I(x))) = \tilde{I}(x) .

B. Uniqueness. Before specifying how to actually construct a map N_I obeying Eq. (7a), let
us address the issue of uniqueness. That is, given two maps N_{1I}(x) and N_{2I}(x), both of
which obey Eq. (7a), how are they related? By hypothesis, we have

    N_{1I_S}(x) = S^{-1}(N_{1I}(x)) ,
    N_{2I_S}(x) = S^{-1}(N_{2I}(x)) .   (10)

Since for any invertible maps f(x) and g(x) we have

    [f(g(x))]^{-1} = g^{-1}(f^{-1}(x)) ,   (11a)

we can rewrite the first line of Eq. (10) as

    N_{1I_S}^{-1}(x) = N_{1I}^{-1}(S(x)) .   (11b)

Let us now define a new map M_I(x) by

    M_I(x) = N_{1I}^{-1}(N_{2I}(x)) ,   (12)

which reduces to the identity map when N_{1I} = N_{2I}; then by Eq. (11b) and the second line of
Eq. (10), we have

    M_{I_S}(x) = N_{1I_S}^{-1}(N_{2I_S}(x))
               = N_{1I}^{-1}(S(S^{-1}(N_{2I}(x))))   (13a)
               = N_{1I}^{-1}(N_{2I}(x)) = M_I(x) .

In other words, M_I(x) depends on the image I only through invariants under transformations
of the group Q, and from Eq. (12), the normalizing map N_{2I} is related to the normalizing
map N_{1I} by

    N_{2I}(x) = N_{1I}(M_I(x)) .   (13b)

This is why in writing the general normalized image corresponding to a particular normalizing
map in Eq. (8), we have included in the x dependence the possible appearance of a map M_I
which depends on the image only through invariants under transformation by elements of Q.


C. Construction of N_I by imposing constraints, and demonstration that normalization yields
a complete set of invariants. We next show that one can construct an image normalization
transformation obeying Eq. (7a) by imposing a suitable set of constraints. We shall assume
now that Q is a K-parameter Lie group which is continuously connected to
the identity. Let C_k[I] = C_k[I(x)], k = 1,...,K (where x is a dummy variable) be a set of
functionals of the image I(x) with the property that the K constraints

    C_k[I_{S'}] = C_k[I(S'(x))] = 0 , \quad k = 1,...,K   (14a)

are satisfied for a unique element S' = N_I of Q, so that

    C_k[I(N_I(x))] = 0 , \quad k = 1,...,K .   (14b)

Then, as we shall now show, N_I(x) is the desired normalizing transformation.
We remark that the condition that Eqs. (14a, b) should have a unique solution can
be relaxed in applications to the condition that there be only one solution in the range of
relevant viewing transformation parameters. Clearly, either form of the uniqueness condition
requires that the constraint functionals not be invariants under Q, and thus their structure
will in general be simpler than that of directly constructed viewing transformation invariants.

In many cases, as we will see in Sec. 3 below, the constraints can be constructed
from viewing transformation covariants, which have simple algebraic properties under the
transformations of Q, permitting closed form algebraic solution for the parameters of the
normalizing transformation. In more complicated cases, as discussed in Sec. 4, the constraints
must be solved numerically for the normalizing transformation.

To see that the construction of Eqs. (14a, b) gives a transformation N_I(x) that obeys
Eq. (7a), let us consider the effect of replacing I by I_S in Eqs. (14a, b). By hypothesis, the
constraints

    C_k[I_S(S'(x))] = 0 , \quad k = 1,...,K   (15a)

are uniquely satisfied by a group element S' = N_{I_S} of Q, so that

    C_k[I_S(N_{I_S}(x))] = 0 , \quad k = 1,...,K ,   (15b)

with N_{I_S}(x) the proposed normalizing transformation corresponding to I_S. But using
Eq. (6a), we can also write Eq. (15b) as

    C_k[I(S(N_{I_S}(x)))] = 0 , \quad k = 1,...,K ,   (15c)

which has the same structure as Eq. (14b). Therefore, by uniqueness of the solution N_I of
Eq. (14b) we must have

    S(N_{I_S}(x)) = N_I(x) ,   (16a)

which by Eq. (7b) is equivalent to

    N_{I_S}(x) = S^{-1}(N_I(x)) ,   (16b)

showing that the N_I produced by solving the constraints does indeed obey Eq. (7a). Hence
the imposition of constraints gives a constructive procedure for generating image
normalization transformations.

We note that this construction makes the normalizing transformation N_I an element of
the group Q, and the quotient M_I(x) = N_{1I}^{-1}(N_{2I}(x)) of two normalizing maps constructed
by imposing different sets of constraints will likewise be an element of Q. When both N_I and
M_I in Eq. (8) belong to Q, we can invert Eq. (8) to express the original image I in terms of
the invariant, normalized image \tilde{I} according to

    I(x) = \tilde{I}(M_I^{-1}(N_I^{-1}(x))) .   (16c)

This equation shows that normalization leads to a complete set of invariants, in the sense
that the information in the normalized image, plus the K parameters determining the
viewing transformation M_I^{-1}(N_I^{-1}(x)), suffice to completely reconstruct the original image. By
way of contrast, the representation-theoretic methods discussed in Sec. 6.5 of Lenz [12], and
the integral transform methods of Ferraro [7], although attacking the same problem as is
discussed here, yield only a small fraction of the complete set of invariants. Moreover,
normalization has the further advantage of requiring only a minimal knowledge of the kinematic
structure of the group; the full irreducible representation structure is not needed, and the
methods described here are applicable to noncompact as well as to compact groups. We note
finally that the discussion of this section is slightly less general than that of Subsecs. 2A and
2B, where we did not require either N_I or M_I to belong to Q; the most general normalizing
map N_I is obtained from one generated by constraints by using as its argument a map M_I
which does not belong to Q but that is invariant under transformations of the image I by Q.

D. Extension to reflections and contrast invariance. We consider next two simple extensions
of the constraint method for constructing the normalizing transformation. The first involves
relaxing the requirement that Q be continuously connected to the identity, as is needed if Q
contains improper transformations such as reflections. Reflections are said to be independent
if they do not differ solely by an element of the connected component of the group; for each
independent discrete reflection R in Q, the set of constraints of Eq. (14a) must be augmented
by an additional constraint D[I(S'(x))] > 0, where D[I(x)] is a functional of the image which
changes sign under the reflection operation R,

    D[I(R(x))] = -D[I(x)] .   (17)

The second extension involves incorporating invariance under changes of image contrast,
that is, under image transformations of the form

    I(x) \to c I(x) , \quad c > 0 .   (18a)

To the extent that illumination is sufficiently slowly varying that it can be treated as constant
over a viewed object, changes in illumination level as the object is moved to different views
take the form of changes in the constant c in Eq. (18a), which is why incorporating contrast
invariance can be important. If we require that the constraint functionals C_k [and D if
needed] should be invariant under the change of contrast of Eq. (18a), then the image
normalization transformation N_I(x) and the auxiliary transformation M_I(x) can be taken
to be contrast invariant. A contrast invariant normalized image \tilde{I}_c(x) is then obtained by
the obvious recipe

    \tilde{I}_c(x) = \frac{\tilde{I}(x)}{\int d^2x \, \tilde{I}(x)} .   (18b)

E. Use of subgroup decompositions. Suppose that for a general element S of the group Q,
there is a subgroup decomposition of the form

    S = S_2 S_1 ,   (19a)

with S_2 belonging to a subgroup Q_2 of Q, S_1 belonging to a subgroup Q_1 of Q, and with the
respective parameter counts K, K_1, and K_2 of Q, Q_1, and Q_2 obeying

    K = K_1 + K_2 .   (19b)

(Such subgroup decompositions for a general Lie group are obtained by constructing a
composition series for the group, but we will not need this formal apparatus in the relatively
simple applications that follow.) Let us suppose further that we can solve the problem of
image normalization with respect to the group Q_1, and that we wish to extend this solution
to the full invariance group Q. The subgroup decomposition allows this to be done by
imposing K_2 additional constraints to deal with the Q_2 subgroup, as follows. Let C_{2k}[I(x)],
with k = 1,...,K_2, be a set of functionals of the image chosen so that the constraints

    C_{2k}[I(N_{2I}(S_1(x)))] = 0 , \quad k = 1,...,K_2   (20a)

are independent of S_1 \in Q_1. In particular, taking S_1 as the identity transformation, Eq. (20a)
simplifies to

    C_{2k}[I(N_{2I}(x))] = 0 , \quad k = 1,...,K_2 ,   (20b)

which if we impose the requirement of a unique solution over transformations N_{2I} \in Q_2
determines a "partial normalization" transformation N_{2I}. Note that a sufficient condition
for the constraints of Eq. (20a) to be independent of S_1 is for the functional C_{2k} to be
S_1-independent, but this is not a necessary condition; we will see examples in which, as
S_1 traverses Q_1, the functionals are merely covariant in some simple way that guarantees
invariance of the constraints obtained by equating all the functionals to zero. To see how
N_{2I} transforms under the action of the group Q, we replace I by I_S in Eq. (20b), giving

    C_{2k}[I_S(N_{2I_S}(x))] = 0 , \quad k = 1,...,K_2 ;   (21a)

again making use of Eq. (6a) this becomes

    C_{2k}[I(S(N_{2I_S}(x)))] = 0 , \quad k = 1,...,K_2 .   (21b)

Since the argument S(N_{2I_S}(x)) appearing in Eq. (21b) is no longer a member of the Q_2
subgroup, we cannot conclude that it is equal to the argument N_{2I}(x) appearing in Eq. (20b),
but the arguments can differ at most by a transformation of x by some member S'_1 of the
subgroup Q_1 which leaves the constraints invariant, giving

    N_{2I_S}(x) = S^{-1}(N_{2I}(S'_1(x)))   (22a)

as the subgroup analog of Eq. (7a). Corresponding to this, the partially normalized image
defined by

    \tilde{I}(x) = I(N_{2I}(x))   (22b)

transforms under the group Q as

    \tilde{I}(x) \to \tilde{I}_S(x) = I_S(N_{2I_S}(x))
                             = I(S(S^{-1}(N_{2I}(S'_1(x)))))   (22c)
                             = I(N_{2I}(S'_1(x))) = \tilde{I}(S'_1(x)) ,

and thus changes only by a transformation lying in the Q_1 subgroup.

Further image normalization of \tilde{I} using the constraints appropriate to Q_1 then gives a final
normalized image

    \hat{I}(x) = I(N_{2I}(N_{1\tilde{I}}(M_I(x)))) ,   (23)

which is invariant with respect to the full group of transformations Q, where as before M_I
is any transformation which is constructed solely using Q invariants of the image.

3. Viewing Transformations of a Planar Object 
With Algebraically Solvable Constraints 

We proceed now to apply the general image normalization methods of Sec. 2 to the 
viewing transformations of a planar object. In this section we focus on cases, corresponding 
to linear viewing transformations, in which suitable constraints can be formed using simple 
viewing transformation covariants, leading to algebraically solvable constraints. In the next 
section we will discuss more complicated cases, several of which use the transformations of 
this section as building blocks, in which some of the constraints must be solved by iterative 
methods. 

A. Translations. The translation subgroup of Eq. (1) is given by

    S(x) = x + t ,   (24a)

corresponding to which I_S(x) = I(x + t) describes an image translated by the vector -t. We
take as the constraint functionals

    C_k[I_S] = \int d^2x \, x_k I(x + t) , \quad k = 1, 2 .   (24b)

The constraints C_k = 0, k = 1, 2 can be solved explicitly for t by making the change of
integration variable y = x + t, giving the unique solution t = t_I, with t_I the image "center
of mass"

    t_I = \frac{\int d^2x \, x \, I(x)}{\int d^2x \, I(x)} ,   (25a)

and the corresponding normalizing transformation is

    N_I(x) = x + t_I .   (25b)

Under the action of the translation S, Eq. (25a) becomes

    t_{I_S} = \frac{\int d^2x \, x \, I(x + t)}{\int d^2x \, I(x + t)} ,   (26a)

which by a change of integration variable yields

    t_{I_S} = t_I - t .   (26b)

Thus the normalizing transformation of Eq. (25b) behaves as

    N_{I_S}(x) = x + t_I - t = S^{-1}(N_I(x)) ,   (26c)

in agreement with the general result of Eq. (7a). In accordance with Eq. (8), the translation
invariant image is

    \tilde{I}(x) = I(N_I(M_I(x))) = I(M_I(x) + t_I) ,   (27a)

with M_I(x) any transformation of x which depends only on translation invariant image
features. Usually, one makes the choice M_I(x) = x - t_0, with t_0 a constant vector which
is independent of the image I. This constant vector can of course be taken to be zero,
corresponding to the choice

    M_I(x) = x ,   (27b)

or it can be adjusted to center the translation invariant form of one particular image I_0 at
any desired point.
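
The center-of-mass normalization of Eqs. (25a)-(26c) can be checked numerically. The sketch below is only an illustration under a stated assumption: the image is modeled as a finite set of weighted point masses, I(x) = sum_j w_j delta(x - p_j), which is not a construction used in the text. Under I_S(x) = I(x + t) the masses move from p_j to p_j - t, and the helper name `centroid` is hypothetical.

```python
import numpy as np

def centroid(points, weights):
    """Center of mass t_I of Eq. (25a) for a point-mass image (assumed model)."""
    return weights @ points / weights.sum()

rng = np.random.default_rng(0)
points = rng.normal(size=(50, 2))           # positions p_j of the point masses
weights = rng.uniform(0.1, 1.0, size=50)    # positive intensities w_j

t = np.array([0.8, -1.5])
# I_S(x) = I(x + t): each mass sits where x + t = p_j, i.e. at p_j - t.
points_S = points - t

t_I = centroid(points, weights)
t_IS = centroid(points_S, weights)

# Normalized image I(x + t_I) with M_I the identity: masses at p_j - t_I.
normalized = points - t_I
normalized_S = points_S - t_IS
```

In this model `t_IS` agrees with `t_I - t`, and the two normalized point sets coincide, which is the translation invariance asserted for Eq. (27a).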

Once we have the translation normalized image \tilde{I}(x), all features extracted from it,
such as all Fourier transform or wavelet transform amplitudes, are translation invariant. We
illustrate this explicitly in the case of the Fourier transform, by showing how a translation
invariant Fourier transform \tilde{I}(k) is related to the Fourier transform I(k) of the original image
I(x),

    \tilde{I}(k) = \int d^2x \, e^{-i k \cdot x} \tilde{I}(x)
                 = e^{i k \cdot t_I} I(k) ,   (28a)

where we have taken M_I to be the identity map as in Eq. (27b), and where

    I(k) = \int d^2x \, e^{-i k \cdot x} I(x) .   (28b)

Under translation, the Fourier transform of the original image I(k) behaves as

    I(k) \to I_S(k) = \int d^2x \, e^{-i k \cdot x} I(x + t) = e^{i k \cdot t} I(k) ,   (28c)

and is not invariant, but by Eq. (26b), the factor e^{i k \cdot t_I} behaves as

    e^{i k \cdot t_I} \to e^{i k \cdot t_{I_S}} = e^{-i k \cdot t} e^{i k \cdot t_I} ,   (28d)

and has a compensating noninvariance, making the product \tilde{I}(k) appearing in Eq. (28a)
invariant under image translations.
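
The compensation between Eqs. (28c) and (28d) can be made concrete with a small numerical sketch. As an assumption made only for this illustration, the image is again modeled as weighted point masses, so that the Fourier integral becomes a finite sum; the function names `fourier` and `invariant_fourier` are hypothetical.

```python
import numpy as np

def fourier(points, weights, k):
    """I(k) of Eq. (28b) for a point-mass image: sum_j w_j exp(-i k.p_j)."""
    return np.sum(weights * np.exp(-1j * (points @ k)))

def invariant_fourier(points, weights, k):
    """exp(i k.t_I) I(k), the translation invariant transform of Eq. (28a)."""
    t_I = weights @ points / weights.sum()
    return np.exp(1j * (k @ t_I)) * fourier(points, weights, k)

rng = np.random.default_rng(2)
points = rng.normal(size=(30, 2))
weights = rng.uniform(0.1, 1.0, size=30)
k = np.array([1.7, -0.6])
t = np.array([2.0, 0.3])

# I_S(x) = I(x + t) places the masses at p_j - t.
points_S = points - t

raw = fourier(points, weights, k)
raw_S = fourier(points_S, weights, k)
inv = invariant_fourier(points, weights, k)
inv_S = invariant_fourier(points_S, weights, k)
```

Here `raw_S` picks up the phase exp(i k.t) relative to `raw`, while `inv_S` and `inv` agree.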

B. Separation of affine normalization into translational and homogeneous affine
normalization. Since rotations and scalings are special cases of homogeneous affine
transformations, before discussing them we use the subgroup decomposition method to give the general
procedure for separating the affine normalization problem into a translational
normalization followed by a homogeneous affine normalization. We follow the general procedure of
Eqs. (19a)-(23), taking the subgroup Q_2 to be the translations, and the subgroup Q_1 to be
the homogeneous affine transformations, as in Eqs. (4a, b), so that

    S_2(x) = x + t , \quad S_1(x) = G \cdot x ,
    S(x) = S_2(S_1(x)) = G \cdot x + t ,   (29a)

where the notation G \cdot x denotes the vector with components \sum_{m=1}^{2} G_{nm} x_m. Applying the
same translational constraint functionals C_k of Eq. (24b) to the general affine transformation
S of Eq. (29a), we have

    C_k[I_S] = \int d^2x \, x_k I(G \cdot x + t) ,   (29b)

which on making the change of integration variable

    x \to S_1^{-1}(x) = G^{-1} \cdot x ,   (29c)

gives 

    C_k[I_S] = |\det G|^{-1} \sum_{m=1}^{2} (G^{-1})_{km} \int d^2x \, x_m I(x + t) .   (29d)

Thus, although the translational constraint functionals C_k are not independent of S_1 (or G),
they simply mix linearly when S_1 is changed, and consequently the translational constraints
C_k = 0, k = 1, 2 are S_1-independent. This permits us to normalize out the translational
part of a general affine transformation independently of the homogeneous affine
transformations, leading to a partially normalized image

    \tilde{I}(x) = I(x + t_I)   (30a)

which is translation invariant. Under the full affine group, t_I transforms as

    t_{I_S} = \frac{\int d^2x \, x \, I(G \cdot x + t)}{\int d^2x \, I(G \cdot x + t)} ,   (30b)

which by the same changes of integration variable used before reduces to

    t_{I_S} = G^{-1} \cdot (t_I - t) .   (30c)

Hence the partially normalized image of Eq. (30a) transforms under the full affine group as

    \tilde{I}(x) \to \tilde{I}_S(x) = I_S(x + t_{I_S})
                             = I(G \cdot [x + G^{-1} \cdot (t_I - t)] + t) = I(G \cdot x + t_I)   (31a)
                             = \tilde{I}(G \cdot x) ,

in agreement with the general result of Eq. (22c). In other words, the partially normalized
image \tilde{I} is translation invariant, and is acted on only by the homogeneous part of the affine
transformation. In discussing similarity and affine transformations in Subsecs. C-F which
follow, we will assume that we are always dealing with a partially normalized image which
is translation invariant, but for simplicity of notation we will drop the tilde and simply call
this image I. However, keeping the tilde for the moment, we note that the moments \mu_{pq} of
this partially normalized image, which are called central moments, are defined by

    \mu_{pq} = \int_{-\infty}^{\infty} dx_1 \int_{-\infty}^{\infty} dx_2 \, x_1^p x_2^q \, \tilde{I}(x) .   (31b)


C. Rotations. We begin the discussion of homogeneous affine transformations by considering
pure rotations, with the group action

    S(x) = x_\theta = (x_1 \cos\theta - x_2 \sin\theta , \; x_1 \sin\theta + x_2 \cos\theta) ,   (32)

corresponding to which I_S(x) = I(x_\theta) describes an image rotated by the angle -\theta. We shall
assume, for the moment, that the image to be normalized has no rotational symmetry, in
which case we can take as the constraint functional

    C[I_S] = Phase\left[ \int d^2x \, e^{i\Phi(x)} f(|x|) I(x_\theta) \right] - 1 .   (33a)

Here we have used the notation

    Phase[z] = z/|z|   (33b)

for the complex number z; the function f is arbitrary [6], and the functions \Phi(x) and |x|
are defined by

    \Phi(x) = \arctan(x_2/x_1) , \quad |x| = (x_1^2 + x_2^2)^{1/2} .   (33c)

The constraint C[I_S] = 0 now uniquely determines an angle \theta = \theta_I, which can be calculated
explicitly by making a change of variable x \to x_{-\theta} in Eq. (33a) and using the trigonometric
formula

    \Phi(x_{-\theta}) = \arctan\left( \frac{-x_1 \sin\theta + x_2 \cos\theta}{x_1 \cos\theta + x_2 \sin\theta} \right) = \Phi(x) - \theta ,   (34a)

thus giving

    e^{i\theta_I} = Phase\left[ \int d^2x \, e^{i\Phi(x)} f(|x|) I(x) \right] ,   (34b)

with the corresponding normalization transformation

    N_I(x) = x_{\theta_I} = (x_1 \cos\theta_I - x_2 \sin\theta_I , \; x_1 \sin\theta_I + x_2 \cos\theta_I) .   (34c)

Under the action of the rotation S, Eq. (34b) is transformed to

    e^{i\theta_{I_S}} = Phase\left[ \int d^2x \, e^{i\Phi(x)} f(|x|) I(x_\theta) \right] ,   (35a)

which on making the change of variable x \to x_{-\theta} and using Eq. (34a) gives

    \theta_{I_S} = \theta_I - \theta .   (35b)

Thus the normalization transformation N_I becomes, under the action of S,

    N_{I_S}(x) = x_{\theta_{I_S}} = x_{\theta_I - \theta} = S^{-1}(N_I(x)) ,   (35c)

in agreement with Eq. (7a). Following the prescription of Eq. (8), the rotationally normalized
image is

    \tilde{I}(x) = I(N_I(M_I(x))) = I([M_I(x)]_{\theta_I}) ,   (36)

with M_I(x) any transformation of x which depends only on rotationally invariant image
features. Usually, one makes the choice M_I(x) = x_{\theta_0}, with \theta_0 a constant angle which is
independent of the image I. This angle can of course be taken to be zero, corresponding
to the M_I of Eq. (27b), or it can be used to give the rotationally invariant form of one
particular image a specified orientation. Note that the angle \theta_I used to construct the
normalizing transformation contains useful information about the orientation of the image in
the observer-centered coordinate system, which can be used to disambiguate images which
have the same invariant form, but a different classification depending on their absolute
orientation. For example, in some typefaces a 6 and a 9 have the same rotationally normalized
form, but their \theta_I values will differ by \pi, and so the value of \theta_I modulo 2\pi can be used to
resolve the six-nine ambiguity.
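
The rotational normalization of Eqs. (34b)-(36) can be sketched numerically as follows. Several choices here are assumptions made for the illustration and are not prescribed by the text: the image is modeled as weighted point masses centered at the origin, the weighting is f(r) = r^2 (the choice f(r) = r would reduce the integral to the first moment, which vanishes for a centered image), and the quadrant-aware `arctan2` is used for \Phi. Function names are hypothetical.

```python
import numpy as np

def rotate(points, angle):
    """Apply x -> x_angle of Eq. (32) to each row vector."""
    c, s = np.cos(angle), np.sin(angle)
    return points @ np.array([[c, -s], [s, c]]).T

def orientation(points, weights, f=lambda r: r**2):
    """theta_I of Eq. (34b) for a point-mass image; f is an assumed weighting."""
    r = np.hypot(points[:, 0], points[:, 1])
    phi = np.arctan2(points[:, 1], points[:, 0])
    return np.angle(np.sum(weights * np.exp(1j * phi) * f(r)))

rng = np.random.default_rng(3)
points = rng.normal(size=(60, 2))
weights = rng.uniform(0.1, 1.0, size=60)
points = points - weights @ points / weights.sum()   # translation normalized

theta = 0.9
# I_S(x) = I(x_theta): each mass moves to the position rotated by -theta.
points_S = rotate(points, -theta)

theta_I = orientation(points, weights)
theta_IS = orientation(points_S, weights)

# Normalized image I(x_{theta_I}): masses rotated by -theta_I.
normalized = rotate(points, -theta_I)
normalized_S = rotate(points_S, -theta_IS)
```

The angles are compared modulo 2\pi, since `np.angle` returns values in (-\pi, \pi]; the two normalized point sets coincide.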

Up to this point we have assumed that the image to be normalized has no special
rotational symmetries. Suppose now that I has an N-fold rotational symmetry, so that
I(x) = I(x_{2\pi/N}), and let us consider the integral

    \int d^2x \, e^{iM\Phi(x)} f(|x|) I(x) = \int d^2x \, e^{iM\Phi(x)} f(|x|) I(x_{2\pi/N})
      = \int d^2x \, e^{iM\Phi(x_{-2\pi/N})} f(|x|) I(x) = e^{-iM 2\pi/N} \int d^2x \, e^{iM\Phi(x)} f(|x|) I(x) ,   (37a)

which implies that the integral

    \int d^2x \, e^{iM\Phi(x)} f(|x|) I(x)   (37b)

vanishes unless M/N is an integer (Hu [11]; Abu-Mostafa and Psaltis [2]). Consequently,
when there is a rotational symmetry, the constraint of Eq. (33a) can no longer be used;
instead, we must find the smallest value M = N for which the integral of Eq. (37b) is
nonvanishing, and then normalize using the constraint functional C_N[I_S] defined by

    C_N[I_S] = Phase\left[ \int d^2x \, e^{iN\Phi(x)} f(|x|) I(x_\theta) \right] - 1 .   (38a)

Solving the constraint C_N[I_S] = 0 now determines an angle \theta = \theta_I, unique up to an integer
multiple of 2\pi/N, which can be explicitly calculated from

    e^{iN\theta_I} = Phase\left[ \int d^2x \, e^{iN\Phi(x)} f(|x|) I(x) \right] ,   (38b)

with the corresponding normalizing transformation and normalized image still given by
Eqs. (34c) and (36). In practice, one does not know a priori the rotational symmetry of
the image being normalized; one then deals with the possibility of rotational symmetry by
taking the constraint functional to have the form of Eq. (38a) and including a loop over
N = 1, 2, ... which terminates at the smallest value of N for which the integral used to
construct the constraint is nonvanishing.
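
The loop over N described above might be sketched as follows. This is an illustration under assumptions not made in the text: a point-mass model of the image, the weighting f(r) = r^2, a relative tolerance `tol` for deciding that an integral "vanishes", and a cutoff `max_order` on the loop. The function name is hypothetical.

```python
import numpy as np

def symmetry_order_and_angle(points, weights, f=lambda r: r**2,
                             max_order=12, tol=1e-9):
    """Return the smallest N with a nonvanishing integral (37b), and theta_I.

    theta_I is obtained from Eq. (38b) and is defined modulo 2*pi/N.
    """
    r = np.hypot(points[:, 0], points[:, 1])
    phi = np.arctan2(points[:, 1], points[:, 0])
    scale = np.sum(weights * f(r))             # sets the size of "nonvanishing"
    for N in range(1, max_order + 1):
        z = np.sum(weights * np.exp(1j * N * phi) * f(r))
        if abs(z) > tol * scale:
            return N, np.angle(z) / N
    raise ValueError("no nonvanishing harmonic found up to max_order")

def rotate(points, angle):
    c, s = np.cos(angle), np.sin(angle)
    return points @ np.array([[c, -s], [s, c]]).T

# Build a test image with 3-fold symmetry from rotated copies of a random set.
rng = np.random.default_rng(4)
base = rng.normal(size=(20, 2))
base_w = rng.uniform(0.1, 1.0, size=20)
points = np.vstack([rotate(base, 2 * np.pi * j / 3) for j in range(3)])
weights = np.tile(base_w, 3)

N, theta_I = symmetry_order_and_angle(points, weights)

theta = 0.4
N_S, theta_IS = symmetry_order_and_angle(rotate(points, -theta), weights)
```

For this test image the loop stops at N = 3, and the angle shifts by -theta modulo 2\pi/3 under rotation of the image.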
D. Scaling. We turn next to scaling, with the group action

    S(x) = \lambda x , \quad \lambda > 0 ,   (39)

corresponding to which I_S(x) = I(\lambda x) describes an image scaled in size by a factor \lambda^{-1}. We
take as the constraint functional C_{\mu\nu}[I_S], \mu \neq \nu, given by

    C_{\mu\nu}[I_S] = \frac{\int d^2x \, |x|^{\mu} g(\Phi(x)) I(\lambda x)}{\int d^2x \, |x|^{\nu} g(\Phi(x)) I(\lambda x)} - 1 ,   (40a)

with |x| and \Phi(x) as in Eq. (33c), and with g an arbitrary function. The constraint C_{\mu\nu} = 0
determines a unique solution \lambda = \lambda_I, which by making the change of integration variable
x \to x/\lambda is readily found to be

    \lambda_I = \left[ \frac{\int d^2x \, |x|^{\mu} g(\Phi(x)) I(x)}{\int d^2x \, |x|^{\nu} g(\Phi(x)) I(x)} \right]^{1/(\mu - \nu)} .   (40b)

The scale normalization transformation N_I(x) is then constructed as

    N_I(x) = \lambda_I x ,   (40c)

and under the action of the scaling S, is transformed to

    N_{I_S}(x) = \lambda_{I_S} x .   (40d)

Writing the analog of Eq. (40b) for \lambda_{I_S}, substituting Eq. (39) and scaling \lambda out of the
integration variable as above, a simple calculation gives

    \lambda_{I_S} = \lambda^{-1} \lambda_I ,   (41a)

and so the normalization transformation N_I becomes, under the action of S,

    N_{I_S}(x) = \lambda^{-1} N_I(x) = S^{-1}(N_I(x)) ,   (41b)

again in agreement with Eq. (7a). Following the recipe of Eq. (8), the image normalized
with respect to scaling is

    \tilde{I}(x) = I(N_I(M_I(x))) ,   (42a)

with M_I(x) any transformation of x which depends only on scaling invariant image features.
The customary choice of M_I is M_I(x) = \lambda_0 x, with \lambda_0 an image-independent scale factor,
which can be used to make the normalized form of one particular image have a specified
size. With this choice of M_I, the normalizing integral \int d^2x \, \tilde{I}(x) appearing in the contrast
normalized image of Eq. (18b) can be explicitly calculated in terms of the central moment
\mu_{00}, giving

    \int d^2x \, \tilde{I}(x) = \frac{\mu_{00}}{(\lambda_0 \lambda_I)^2} .   (42b)

A useful specialization of the general method for scaling normalization is to take as the
constraint

    0 = \left. \frac{\partial}{\partial\mu} C_{\mu\nu}[I_S] \right|_{\mu=\nu=0}
      = \frac{\int d^2x \, \log|x| \, g(\Phi(x)) I(\lambda x)}{\int d^2x \, g(\Phi(x)) I(\lambda x)} .   (43a)

By a change of integration variable this can be solved to give a unique value \lambda = \lambda_I given
by

    \log\lambda_I = \frac{\int d^2x \, \log|x| \, g(\Phi(x)) I(x)}{\int d^2x \, g(\Phi(x)) I(x)} ,   (43b)

and which transforms under S as

    \log\lambda_{I_S} = \log(\lambda^{-1} \lambda_I) ,   (43c)

in agreement with Eq. (41a). A potential advantage of the logarithmic weighting factor
in Eq. (43b), as compared with the power law weighting factors in Eq. (40a), is that the
logarithm does not suppress the contribution arising from either the center or the periphery
of the image.
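
Both the power law and the logarithmic scale normalizations can be checked with a short numerical sketch. The assumptions made only for this illustration are: a point-mass model of the image (under I_S(x) = I(\lambda x) a mass at p_j moves to p_j/\lambda and its weight is divided by \lambda^2), the trivial angular weighting g = 1, and the exponents \mu = 2, \nu = 0. The function names are hypothetical.

```python
import numpy as np

def scale_power(points, weights, mu=2.0, nu=0.0):
    """lambda_I of Eq. (40b) with g = 1 (assumed) for a point-mass image."""
    r = np.hypot(points[:, 0], points[:, 1])
    ratio = np.sum(weights * r**mu) / np.sum(weights * r**nu)
    return ratio ** (1.0 / (mu - nu))

def scale_log(points, weights):
    """lambda_I of Eq. (43b) with g = 1 (assumed): exp of the mean of log|x|."""
    r = np.hypot(points[:, 0], points[:, 1])
    return np.exp(np.sum(weights * np.log(r)) / np.sum(weights))

rng = np.random.default_rng(5)
points = rng.normal(size=(50, 2))
weights = rng.uniform(0.1, 1.0, size=50)

lam = 2.5
# I_S(x) = I(lam * x): masses at p_j / lam, weights divided by lam**2.
points_S = points / lam
weights_S = weights / lam**2

lam_I = scale_power(points, weights)
lam_IS = scale_power(points_S, weights_S)
lam_I_log = scale_log(points, weights)
lam_IS_log = scale_log(points_S, weights_S)

# Normalized image I(lam_I * x): masses at p_j / lam_I.
normalized = points / lam_I
normalized_S = points_S / lam_IS
```

Both estimators obey the transformation law of Eq. (41a) in this model, although their numerical values differ because they weight the center and periphery of the image differently.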

E. General similarity transformations. The general similarity transformation consists of a
translation, a rotation, and a scaling, and so can be normalized using the methods of Subsecs.
3A-D. The rotational and scaling normalizations can evidently be combined into a single
step, in which the arbitrary functions f(|x|) and g(\Phi(x)) no longer appear, having been
replaced by the specific weightings appropriate to scaling and rotational normalizations. In
the image normalized with respect to general similarity transformations, the undetermined
map M_I(x) can now depend only on general similarity invariants of the image, and can of
course be taken to be an image-independent map, including the identity transformation.
F. Homogeneous affine transformations. We turn next to the general case of the homogeneous 
affine transformation, with the group action 

S(x) = G ■ x = (G n xi + G 12 x 2 , G 2 \xi + G 22 x 2 ). (44a) 

We again follow the subgroup decomposition method of Eqs. (19a)-(23), now taking (Dirilten 
and Newman [6]) 

the subgroup Q 2 to be the group of affine transformations with vanishing upper diagonal 

matrix element, and the subgroup Qi to be a rotation R, as in Eqs. (5a, b), so that 

S 2 (x) = g-x, S 1 (x)=x e , 

(446) 

S(x) = S 2 (S 1 (x)) = g ■ x e = G ■ x . 
To normalize the image with respect to the three-parameter group $\mathcal{G}_2$, we will need three 
constraints, which following [15] we take as 

$$C_k[I_S] = 0 , \quad k = 1, 2, 3 , \tag{45a}$$

with the constraint functionals $C_{1,2,3}[I_S]$ given by 

$$
C_1[I_S] = \frac{\int d^2x\, x_1^2\, I(G \cdot x)}{\int d^2x\, I(G \cdot x)} - 1 , \quad
C_2[I_S] = \frac{\int d^2x\, x_2^2\, I(G \cdot x)}{\int d^2x\, I(G \cdot x)} - 1 , \quad
C_3[I_S] = \int d^2x\, x_1 x_2\, I(G \cdot x) . \tag{45b}
$$

Although the constraint functionals of Eq. (45b) are not independent of the element $S_1$ of 
the subgroup $\mathcal{G}_1$, that is, they are not $\theta$-independent, they are easily seen to simply mix 
into $\theta$-dependent linear combinations of themselves as $\theta$ is varied, and so the constraints of 
Eq. (45a) are $\theta$-independent. Taking $\theta = 0$ and making a change of variable $x \to g^{-1} \cdot x$, 
one can explicitly solve the constraints to give a unique solution $g = g_I$, 

$$
g_{I11} = \left( \frac{\mu_{20}}{\mu_{00}} \right)^{1/2} , \quad
g_{I12} = 0 , \quad
g_{I21} = \frac{\mu_{11}}{(\mu_{20}\mu_{00})^{1/2}} , \quad
g_{I22} = \left( \frac{\mu_{02}\mu_{20} - \mu_{11}^2}{\mu_{20}\mu_{00}} \right)^{1/2} . \tag{46}
$$

Since the Schwarz inequality implies that $\mu_{11}^2 \le \mu_{02}\mu_{20}$, the matrix element $g_{I22}$ is always 
a real number. [Equation (46) assumes that $\mu_{20}$ is nonzero; if $\mu_{20}$ vanishes and if $I$ is not 
identically zero, then the moment $\mu_{02}$ will be nonvanishing, and so one can apply Eq. (46) 
after first rotating the image by 90 degrees.] The normalizing transformation for the subgroup 
$\mathcal{G}_2$ is now constructed as 

$$N_{2I}(x) = g_I \cdot x . \tag{47}$$
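
The construction of Eqs. (46) and (47) can be sketched numerically as follows. This is a minimal illustration, assuming the image is represented by a small set of weighted points (x1, x2, w) rather than a raster; that representation, the sample data, and the function names are assumptions of the sketch and are not part of the text. The overall Jacobian factor 1/det g multiplying the weights is a pure contrast factor and is dropped, since the constraints of Eq. (45a) do not depend on it.

```python
import math

def central_moments(points):
    """Second-order central moments of weighted points (x1, x2, w), plus the centroid."""
    m00 = sum(w for _, _, w in points)
    c1 = sum(w * x for x, _, w in points) / m00
    c2 = sum(w * y for _, y, w in points) / m00
    mu = {(0, 0): m00}
    for p, q in [(2, 0), (1, 1), (0, 2)]:
        mu[(p, q)] = sum(w * (x - c1) ** p * (y - c2) ** q for x, y, w in points)
    return mu, (c1, c2)

def normalizing_matrix(mu):
    """Lower-triangular matrix g_I of Eq. (46), returned as ((g11, g12), (g21, g22))."""
    m00, m20, m11, m02 = mu[(0, 0)], mu[(2, 0)], mu[(1, 1)], mu[(0, 2)]
    g11 = math.sqrt(m20 / m00)
    g21 = m11 / math.sqrt(m20 * m00)
    g22 = math.sqrt((m02 * m20 - m11 ** 2) / (m20 * m00))
    return ((g11, 0.0), (g21, g22))

def partially_normalize(g, points, center):
    """Points of the image I(g.x): solve g.x = y - center for every point y of I."""
    (g11, _), (g21, g22) = g
    out = []
    for y1, y2, w in points:
        u1, u2 = y1 - center[0], y2 - center[1]
        x1 = u1 / g11                    # first row of the triangular system
        x2 = (u2 - g21 * x1) / g22       # second row, by forward substitution
        out.append((x1, x2, w))
    return out

# Hypothetical sample image (not from the text).
points = [(0.0, 0.0, 1.0), (2.0, 0.5, 2.0), (1.0, 3.0, 1.5), (-1.0, 1.0, 0.5), (3.0, 2.0, 1.0)]
mu, center = central_moments(points)
g_I = normalizing_matrix(mu)
normalized = partially_normalize(g_I, points, center)
mu_n, _ = central_moments(normalized)
```

In the normalized point set the ratios `mu_n[(2, 0)] / mu_n[(0, 0)]` and `mu_n[(0, 2)] / mu_n[(0, 0)]` should both equal one and `mu_n[(1, 1)]` should vanish, which is the content of the three constraints of Eq. (45a).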

By a lengthy algebraic calculation, one can verify that under a general proper (i.e., positive 
determinant) affine transformation S, the normalizing matrix $g_I$ transforms as 

$$g_I \to g_{I_S} = G^{-1} g_I R' , \tag{48a}$$

with R' a rotation matrix which is a complicated function of the matrix elements of $g_I$ and 
of G. Hence under the action of S, the normalizing transformation for $\mathcal{G}_2$ becomes 

$$N_{2 I_S}(x) = g_{I_S} \cdot x = (G^{-1} g_I R') \cdot x = S^{-1}(N_{2I}(R' \cdot x)) , \tag{48b}$$

in agreement with Eq. (22c). To obtain an affine normalized image, one of course does not 
need the explicit form of R'; one first forms the partially normalized image 

$$\tilde{I}(x) = I(N_{2I}(x)) , \tag{49a}$$

and then normalizes with respect to rotations as in Eq. (36) of Subsec. 3C, to get the final 
normalized image 

$$\hat{I}(x) = I(N_{2I}(\vec{M}_I(x)_{\theta'_I})) . \tag{49b}$$
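
The transformation law of Eq. (48a) can be spot-checked numerically without carrying out the algebra. The sketch below assumes a weighted point-set stand-in for the image and a sample matrix G (both hypothetical), forms the matrix that Eq. (48a) identifies with R' from the two normalizing matrices, and leaves it to the reader to confirm that it is a proper rotation.

```python
import math

def g_matrix(points):
    """Normalizing matrix of Eq. (46) for weighted points (x1, x2, w)."""
    m00 = sum(w for _, _, w in points)
    c1 = sum(w * x for x, _, w in points) / m00
    c2 = sum(w * y for _, y, w in points) / m00
    m20 = sum(w * (x - c1) ** 2 for x, _, w in points)
    m11 = sum(w * (x - c1) * (y - c2) for x, y, w in points)
    m02 = sum(w * (y - c2) ** 2 for _, y, w in points)
    return [[math.sqrt(m20 / m00), 0.0],
            [m11 / math.sqrt(m20 * m00),
             math.sqrt((m02 * m20 - m11 ** 2) / (m20 * m00))]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def inverse(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]

def transpose(a):
    return [[a[0][0], a[1][0]], [a[0][1], a[1][1]]]

# Hypothetical image and proper (positive determinant) affine matrix G.
points = [(0.0, 0.0, 1.0), (2.0, 0.5, 2.0), (1.0, 3.0, 1.5), (-1.0, 1.0, 0.5), (3.0, 2.0, 1.0)]
G = [[1.3, 0.4], [-0.2, 0.9]]

# The image I_S(x) = I(G.x) has its points at G^{-1}.y; the constant Jacobian
# factor on the weights is a contrast factor and does not affect g.
G_inv = inverse(G)
points_S = [(G_inv[0][0] * x + G_inv[0][1] * y, G_inv[1][0] * x + G_inv[1][1] * y, w)
            for x, y, w in points]

g_I = g_matrix(points)
g_IS = g_matrix(points_S)

# Eq. (48a) rearranged: R' = g_I^{-1} . G . g_{I_S}
R_prime = matmul(inverse(g_I), matmul(G, g_IS))
should_be_identity = matmul(R_prime, transpose(R_prime))
det_R = R_prime[0][0] * R_prime[1][1] - R_prime[0][1] * R_prime[1][0]
```

If Eq. (48a) holds, `should_be_identity` is the unit matrix and `det_R` equals one, up to rounding.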

The map $\vec{M}_I$ is constructed only from affine invariants of the image; the simplest choice is 
$\vec{M}_I(x) = G_0 \cdot x$, with $G_0$ a fixed affine transformation which can be chosen to give the affine 
normalized version of one particular pattern a specified form. With this choice of $\vec{M}_I$, the 
normalization integral $\int d^2x\, \hat{I}(x)$ required for contrast normalization is explicitly given in 
terms of central moments by 

$$
\int d^2x\, \hat{I}(x) = \frac{\mu_{00}}{\det G_0 \, \det g_I}
= \frac{\mu_{00}^2}{\det G_0 \, (\mu_{02}\mu_{20} - \mu_{11}^2)^{1/2}} . \tag{50}
$$
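
The normalization integral follows from a change of integration variable. As a sketch, assume the fully normalized image has the form $I(g_I R\, G_0\, x)$, with $R$ the rotation fixed by the rotational normalization of Subsec. 3C (so that $\det R = 1$) and $\det G_0 > 0$; the symbol $y$ for the new integration variable is introduced here only for illustration.

```latex
\int d^2x\, I(g_I R\, G_0\, x)
  = \frac{1}{\det g_I \,\det R\, \det G_0} \int d^2y\, I(y)
  = \frac{\mu_{00}}{\det G_0 \,\det g_I} ,
\qquad
\det g_I = g_{I11}\, g_{I22}
  = \frac{(\mu_{02}\mu_{20} - \mu_{11}^2)^{1/2}}{\mu_{00}} .
```

Substituting the determinant of the triangular matrix of Eq. (46) into the first expression gives the second form, with $\mu_{00}^2$ in the numerator.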



An alternative way to normalize the homogeneous affine transformations is to use the 
subgroup factorization [cf. Eq. (5c)] 

$$S_2(x) = g' \cdot x , \quad S_1(x) = \lambda x_\theta , \quad S(x) = S_2(S_1(x)) = g' \cdot \lambda x_\theta = G \cdot x , \tag{51a}$$

with g' restricted to have both zero upper right off-diagonal matrix element and unit 
determinant. Since g' and $\mathcal{G}_2$ now contain only two parameters, and since $\mathcal{G}_1$ now includes both 
rotations and scalings, partial normalization with respect to $\mathcal{G}_2$ requires two constraints 
which must be both rotation and scaling invariant. Inspecting Eq. (45b), we see that an 
obvious choice of constraint functionals is now 

$$
C'_1[I_S] = \int d^2x\, (x_2^2 - x_1^2)\, I(G \cdot x) , \quad
C'_2[I_S] = \int d^2x\, x_1 x_2\, I(G \cdot x) . \tag{51b}
$$

Again, although these functionals are not rotation and (in the case of $C'_2$) scale invariant, the 
constraints $C'_1 = 0$, $C'_2 = 0$ are invariant, and solving them gives the not surprising result 

$$g'_I = (\det g_I)^{-1/2}\, g_I , \tag{51c}$$

with $g_I$ as given in Eq. (46). The normalizing transformation for the subgroup $\mathcal{G}_2$ is now 
constructed as in Eq. (47), with $g'_I$ replacing $g_I$, and the partially normalized image is again 
given by Eq. (49a), but now the final step leading to a fully affine normalized image consists 
of a further combined normalization with respect to rotation and scaling of the type described 
in Subsec. 3E. 
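
A short numerical illustration of Eq. (51c), assuming sample values for the central moments (the numbers and function names below are hypothetical): rescaling the matrix of Eq. (46) to unit determinant leaves a normalized image whose second moments are isotropic, which is what the two constraints built from Eq. (51b) require, while the overall scale is left to the later combined rotation and scaling step.

```python
import math

def unit_det_normalizer(m00, m20, m11, m02):
    """g'_I = (det g_I)^(-1/2) g_I, with g_I the lower-triangular matrix of Eq. (46)."""
    g11 = math.sqrt(m20 / m00)
    g21 = m11 / math.sqrt(m20 * m00)
    g22 = math.sqrt((m02 * m20 - m11 ** 2) / (m20 * m00))
    s = (g11 * g22) ** -0.5            # (det g_I)^(-1/2)
    return ((s * g11, 0.0), (s * g21, s * g22))

def second_moments_after(g, m20, m11, m02):
    """Second moments of I(g.x) for unit-determinant lower-triangular g: g^{-1} M g^{-T}."""
    (g11, _), (g21, g22) = g
    h11, h21, h22 = 1.0 / g11, -g21 / (g11 * g22), 1.0 / g22   # entries of g^{-1}
    n20 = h11 * h11 * m20
    n11 = h11 * (h21 * m20 + h22 * m11)
    n02 = h21 * h21 * m20 + 2.0 * h21 * h22 * m11 + h22 * h22 * m02
    return n20, n11, n02

# Hypothetical central moments mu_00, mu_20, mu_11, mu_02.
m00, m20, m11, m02 = 2.0, 3.0, 1.0, 4.0
g_prime = unit_det_normalizer(m00, m20, m11, m02)
n20, n11, n02 = second_moments_after(g_prime, m20, m11, m02)
det_g_prime = g_prime[0][0] * g_prime[1][1]
```

Here `n02 - n20` and `n11` play the roles of the two constraint functionals evaluated on the partially normalized image, and both should vanish.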

4. Viewing Transformations With Numerically Solvable Constraints 

In this section we continue with the application of the general image normalization 
methods of Sec. 2 to the viewing transformations of a planar object, focusing on cases in 
which the constraints are not all algebraically solvable, so that iterative numerical methods 
are needed. We then go on to consider some other normalization problems of interest, that 
can also be solved by iterative methods. 

A. Projective transformations. So far we have discussed linear transformations S(x), which 
within the general normalization framework of Sec. 2 lead to algebraically solvable 
constraints. We turn now to nonlinear transformations, to which the general analysis also 
applies, beginning with the planar projective transformation for which S(x) is given by 

$$S_n(x) = \frac{\sum_{m=1}^{2} G_{nm} x_m + t_n}{1 + \sum_{m=1}^{2} p_m x_m} . \tag{52a}$$

We again use the subgroup decomposition method, writing 

$$S(x) = S_2(S_1(x)) , \tag{52b}$$

with $S_2 \in \mathcal{G}_2$ a restricted projective transformation and $S_1 \in \mathcal{G}_1$ an affine transformation, 
as in Eq. (3) and Eq. (2) respectively. Since $\mathcal{G}_2$ is a two-parameter Lie group, we need two 
constraints, which must be invariant under the action of the affine transformations of $\mathcal{G}_1$, to 
partially normalize the image. We have not been able to find two simple constraint 
functionals which yield algebraically solvable affine invariant constraints when equated to zero, which 
would be the analog of our previous two applications of the subgroup method. Instead, we 
work with constraint functionals which are fully affine and contrast invariant, as obtained 
by the algebraic methods of Hu [11] and Reiss [14], which because of their complexity must 
be solved numerically. (Alternatively, one could formulate the projective constraints using 
two independent affine invariants constructed by the affine normalization procedure of the 
preceding section, again solving the constraints numerically. We emphasize that in either 
case, the constraints used for projective normalization are not projective invariants, but only 
invariants under the much simpler affine subgroup of the full projective group.) Using only 
third and lower central moments, one can form the following three functionals of the image 
which are affine and contrast invariant, and which are non-singular (in fact vanishing) for 
images with both $x_1$ and $x_2$ reflection symmetry, 



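$$\Psi_1[I] , \quad \Psi_2[I] , \quad \Psi_3[I] ,$$

constructed from the moment polynomials 

$$
\begin{aligned}
I_1 &= (ad - bc)^2 - 4(ac - b^2)(bd - c^2) , \\
I_2 &= A(bd - c^2) - B(ad - bc) + C(ac - b^2) , \\
I_3 &= a^2 C^3 - 6abBC^2 + 6acC(2B^2 - AC) + ad(6ABC - 8B^3) + 9b^2 AC^2 \\
    &\quad - 18bcABC + 6bdA(2B^2 - AC) + 9c^2 A^2 C - 6cdA^2 B + d^2 A^3 , \\
A &= \mu_{20} , \quad B = \mu_{11} , \quad C = \mu_{02} , \\
a &= \mu_{30} , \quad b = \mu_{21} , \quad c = \mu_{12} , \quad d = \mu_{03} .
\end{aligned}
\tag{53a}
$$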
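
The invariance properties claimed for such functionals can be checked numerically. In the sketch below the three polynomials in the second and third central moments are divided by powers of $\mu_{00}$ and of $AC - B^2$; these powers are an assumption of the sketch, fixed by requiring invariance under contrast rescaling and under affine maps, and are not quoted from the text. The weighted point-set image, the sample affine map, and all function names are likewise hypothetical.

```python
def central_moments(points, max_order=3):
    """Central moments mu[(p, q)] with p + q <= max_order for weighted points (x1, x2, w)."""
    m00 = sum(w for _, _, w in points)
    cx = sum(w * x for x, _, w in points) / m00
    cy = sum(w * y for _, y, w in points) / m00
    mu = {}
    for p in range(max_order + 1):
        for q in range(max_order + 1 - p):
            mu[(p, q)] = sum(w * (x - cx) ** p * (y - cy) ** q for x, y, w in points)
    return mu

def moment_polynomials(mu):
    """The three polynomials in second and third central moments."""
    A, B, C = mu[(2, 0)], mu[(1, 1)], mu[(0, 2)]
    a, b, c, d = mu[(3, 0)], mu[(2, 1)], mu[(1, 2)], mu[(0, 3)]
    I1 = (a * d - b * c) ** 2 - 4 * (a * c - b * b) * (b * d - c * c)
    I2 = A * (b * d - c * c) - B * (a * d - b * c) + C * (a * c - b * b)
    I3 = (a * a * C ** 3 - 6 * a * b * B * C * C + 6 * a * c * C * (2 * B * B - A * C)
          + a * d * (6 * A * B * C - 8 * B ** 3) + 9 * b * b * A * C * C
          - 18 * b * c * A * B * C + 6 * b * d * A * (2 * B * B - A * C)
          + 9 * c * c * A * A * C - 6 * c * d * A * A * B + d * d * A ** 3)
    return I1, I2, I3

def contrast_free_invariants(points):
    """ASSUMED normalization: powers of mu_00 and (AC - B^2) chosen by degree counting."""
    mu = central_moments(points)
    m00 = mu[(0, 0)]
    delta = mu[(2, 0)] * mu[(0, 2)] - mu[(1, 1)] ** 2
    I1, I2, I3 = moment_polynomials(mu)
    return (m00 ** 2 * I1 / delta ** 3, m00 * I2 / delta ** 2, m00 * I3 / delta ** 3)

def affine_view(points, G, t, contrast):
    """Move every point by x -> G.x + t and rescale all weights by a contrast factor."""
    return [(G[0][0] * x + G[0][1] * y + t[0], G[1][0] * x + G[1][1] * y + t[1], contrast * w)
            for x, y, w in points]

base = [(0.0, 0.0, 1.0), (2.0, 0.5, 2.0), (1.0, 3.0, 1.5),
        (-1.0, 1.0, 0.5), (3.0, 2.0, 1.0), (0.5, -1.5, 0.8)]
view = affine_view(base, [[1.2, -0.7], [0.3, 0.8]], (5.0, -2.0), 3.0)
psi_base = contrast_free_invariants(base)
psi_view = contrast_free_invariants(view)

# An image symmetric under both x1 and x2 reflections has vanishing third moments.
symmetric = [(1.0, 2.0, 1.0), (-1.0, 2.0, 1.0), (1.0, -2.0, 1.0),
             (-1.0, -2.0, 1.0), (3.0, 0.0, 2.0), (-3.0, 0.0, 2.0)]
psi_symmetric = contrast_free_invariants(symmetric)
```

With this normalization `psi_base` and `psi_view` should agree to rounding accuracy, and `psi_symmetric` should vanish because the denominators involve only zeroth and second moments.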
For example, $C_1[I] = C_2[I] = 0$ could be used as constraints, with 

$$C_1[I] = \Psi_1[I] - \Psi_1^0 , \quad C_2[I] = \Psi_2[I] - \Psi_2^0 , \tag{53b}$$

provided the numerical target values $\Psi_1^0, \Psi_2^0$ fall within the ranges taken by $\Psi_1, \Psi_2$ for the 
image I being normalized. Since the restricted projective transformation 

$$S_2(x) = \frac{x}{1 + p \cdot x} \tag{54a}$$

depends nonlinearly on the parameter p, we cannot algebraically solve the constraints to 
find the normalizing parameter $p_I$, but this can be readily done numerically by an iterative 
method. The partial normalization transformation and partially normalized image are now 
given by 

$$N_{2I}(x) = \frac{x}{1 + p_I \cdot x} , \quad \tilde{I}(x) = I(N_{2I}(x)) . \tag{54b}$$

Finally, one must do a further affine normalization, as in Subsecs. 3B and 3F, 
to get an image normalized with respect to the full planar projection group. If the initial 
image is not well-centered on the raster, it may be advantageous to also do an affine normal- 
ization before the restricted projective partial normalization; this does not affect the results 
provided a second affine normalization is still done as the final normalization step. 
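
One possible iterative scheme for the restricted projective step is a damped Newton iteration with a finite-difference Jacobian, sketched below under several assumptions that are not taken from the text: the image is a weighted point set, the two constraint functionals are polynomials in second and third central moments divided by powers of $\mu_{00}$ and $AC - B^2$ chosen by degree counting, and the step sizes and tolerances are arbitrary. The sketch relies on the fact that the maps $x \to x/(1 + p \cdot x)$ compose by adding their parameters, so a view distorted with parameter q is undone by the parameter -q.

```python
import math

def warp(points, p):
    """Point-set form of I(S_p(x)), S_p(x) = x / (1 + p.x): a point y of I moves to
    y / (1 - p.y) and its weight picks up the Jacobian factor 1 / (1 - p.y)^3."""
    out = []
    for y1, y2, w in points:
        d = 1.0 - (p[0] * y1 + p[1] * y2)
        if d <= 0.0:
            raise ValueError("a point lies on or beyond the singular line of the map")
        out.append((y1 / d, y2 / d, w / d ** 3))
    return out

def psi12(points):
    """Two affine and contrast invariant functionals (ASSUMED normalization)."""
    m00 = sum(w for _, _, w in points)
    cx = sum(w * x for x, _, w in points) / m00
    cy = sum(w * y for _, y, w in points) / m00
    def mu(p, q):
        return sum(w * (x - cx) ** p * (y - cy) ** q for x, y, w in points)
    A, B, C = mu(2, 0), mu(1, 1), mu(0, 2)
    a, b, c, d = mu(3, 0), mu(2, 1), mu(1, 2), mu(0, 3)
    delta = A * C - B * B
    I1 = (a * d - b * c) ** 2 - 4 * (a * c - b * b) * (b * d - c * c)
    I2 = A * (b * d - c * c) - B * (a * d - b * c) + C * (a * c - b * b)
    return (m00 ** 2 * I1 / delta ** 3, m00 * I2 / delta ** 2)

def residual(points, p, target):
    psi = psi12(warp(points, p))
    return (psi[0] - target[0], psi[1] - target[1])

def solve_p(points, target, steps=50, h=1e-6, tol=1e-12):
    """Damped Newton iteration on p, accepting only steps that reduce the residual."""
    p = [0.0, 0.0]
    r = residual(points, p, target)
    for _ in range(steps):
        norm = math.hypot(r[0], r[1])
        if norm < tol:
            break
        rx = residual(points, [p[0] + h, p[1]], target)
        ry = residual(points, [p[0], p[1] + h], target)
        J = [[(rx[0] - r[0]) / h, (ry[0] - r[0]) / h],      # forward-difference Jacobian
             [(rx[1] - r[1]) / h, (ry[1] - r[1]) / h]]
        det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
        if det == 0.0:
            break
        dp = [-(J[1][1] * r[0] - J[0][1] * r[1]) / det,
              -(J[0][0] * r[1] - J[1][0] * r[0]) / det]
        step = 1.0
        while step > 1e-6:                                    # backtracking line search
            trial = [p[0] + step * dp[0], p[1] + step * dp[1]]
            try:
                r_trial = residual(points, trial, target)
            except ValueError:
                r_trial = None
            if r_trial is not None and math.hypot(r_trial[0], r_trial[1]) < norm:
                p, r = trial, r_trial
                break
            step *= 0.5
        else:
            break                                             # no acceptable step found
    return p, r

# Hypothetical reference image, target values, and distorted view.
base = [(0.0, 0.0, 1.0), (2.0, 0.5, 2.0), (1.0, 3.0, 1.5),
        (-1.0, 1.0, 0.5), (3.0, 2.0, 1.0), (0.5, -1.5, 0.8)]
target = psi12(base)
q = (0.02, -0.015)
observed = warp(base, q)
r_start = residual(observed, [0.0, 0.0], target)
p_I, r_final = solve_p(observed, target)
r_exact = residual(observed, [-q[0], -q[1]], target)
```

By construction the parameter `(-q[0], -q[1])` satisfies both constraints, so `r_exact` vanishes to rounding accuracy; whether the iteration started at zero reaches that root, or another one, depends on the image and on the target values, in line with the caveats on attainable targets discussed in the text.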

The fact that one must know the range of $\Psi_1, \Psi_2$ to pick target numerical values for 
normalization may prove a significant limitation, since it is likely that there is no universal 
pair of target values which is guaranteed to be attainable for any image. (For a discussion 
of related problems with projective normalization, see Astrom [1].) Consequently, it may 
be necessary to have a preliminary classification of the viewed object before attempting 
projective normalization. However, this may not be a problem in some applications, as for 
example when an approaching object is tracked and can be classified when it is still far enough 
away for the affine approximation to the general projective transformation to be accurate. 
As the object gets closer, knowledge of its class can be used to determine the constraints to 
be used in projective normalization, and the values of $p_I$ and the affine parameters obtained 
from projective normalization can then be used to deduce information about the object's 
absolute orientation. Another option, in applications where preliminary classification by an 
affine normalizing classifier is feasible, is to use optimization of the match M through the 
classifier to supply the projective constraints; that is, the constraints are taken as 

$$\frac{\partial M}{\partial p_1} = \frac{\partial M}{\partial p_2} = 0 , \tag{55}$$

which are solved by iteration on the restricted projective parameter p to determine $p_I$. 

B. Similarity and Affine Normalization of Partially Occluded Planar Curves. So far we have 
discussed only the normalization of non-occluded images, but the general methods formulated 
in Sec. 2 have been extended by Adler and Krishnan [3] to the more realistic problem [4], 
[5] of the similarity and affine normalization of a partially occluded planar curve, such as 
that characterizing the boundary of a partially occluded planar object. Since full details 
and illustrative numerical results are given in [3], we give here only a sketch of the strategy. 
Consider, for simplicity, the special case in which one has a curve segment, with an identifying 
point P, distorted by an affine transformation. One can construct an affine normalization by 
the method of Sec. 3, by forming constraints using second moments of the curve integrated 
along a finite segment from P to some neighboring point P' . To specify this integration 
segment in an affine invariant way, reference [3] requires that the normalized image of the 
segment have some specified reparameterization invariant arc-length, giving one additional 
constraint that must be solved by numerical iteration. The resulting normalization procedure 
for partially occluded planar curves normalizes against affine transformations using as input 
only first and second parametric derivatives, i.e., only information about the tangent vector 
and the curvature of the curve. This example shows how the general methods of this paper 
can be used as modules in iterative procedures to solve new, previously unsolved, classes of 
normalization problems. 

C. Flexible template normalization. The nonlinear projective image transformations which 
we have just discussed are only one example of much more general nonlinear distortions 
which can make an observed image differ in form from the standard prototype for its class. 
Examples of such distortions include non-planar geometric effects when a character to be 
recognized is printed on a curved surface, variations among hand lettered characters produced 
by different individuals, and variations in facial geometry as a result of changes in facial 
expression. An attractive proposal for dealing with such distortions is the use of "flexible" 
or "deformable" templates (see, e.g. [9]), and the general normalization methods of Sec. 2 
give a possible means for their implementation. We consider briefly in this subsection the 
case of distortions which can be modeled as an image transformation 

$$I(x) \to I(x') , \tag{56a}$$

where x' = f(x) is a general nonlinear remapping or diffeomorphism of the image coordinate 
x. Such diffeomorphisms form a group, and so the general analysis of Sec. 2 formally 
applies, but since the general diffeomorphism group has an infinite number of parameters, 
this observation is of little practical use without making further assumptions. Let us now 
suppose that the predominant nonlinear distortions can be treated as small in magnitude, 
and are well represented by a few terms in an appropriate complete expansion basis. A 
concrete example would be nonlinear distortions described by the transformation 

$$T(x) = t + G \cdot x + H \cdot xx + J \cdot xxx , \tag{56b}$$

with the H term shorthand for the vector with components $\sum_{mp} H_{nmp} x_m x_p$, and similarly 
for the J term. When H, J are effectively of order unity, the transformations of Eq. (56b) do 
not form a group, since iteration of the transformation leads to fourth and higher order terms 
in x. However, if H, J are small enough for terms quadratic and higher order in H, J to be 
neglected, the transformations of Eq. (56b) do, within the first order approximation, form a 
group, and the normalization methods of Sec. 2 become applicable. One could then proceed 
by first constructing an affine preprocessing classifier by the methods of Subsecs. 3A-F, 
thus normalizing for the linear transformation given by the first two terms of Eq. (56b). One 
would then normalize with respect to all of Eq. (56b) by iterating on the coefficients H, J to 
try to get an optimal unique classification through this classifier, using the cost function 

$$C = \| \text{classifier mismatch} \| + \|H\| + \|J\| , \tag{56c}$$

with the terms in Eq. (56c) giving respectively measures of the magnitude of the classifier 
mismatch for the class being considered, the magnitude of the coefficients H, and the mag- 
nitude of the coefficients J. Clearly, a similar method could be applied to the expansions 
of T on any polynomial (and perhaps more general) basis, provided the truncated basis 
is left invariant in form within a suitable first order approximation, for which the affine 
transformations form the zeroth order approximation. 
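
The first order group property can be illustrated with a deliberately reduced example. The sketch assumes a one-dimensional analog of Eq. (56b) with t = 0 and G = 1, so that the distortion is x + h x^2 + j x^3 with scalar coefficients; this reduction, the sample coefficients, and the function names are illustrative assumptions. Composing two such maps differs from the map with summed coefficients only by terms that are quadratic or higher in the coefficients, so halving all coefficients should reduce the mismatch by a factor close to four.

```python
def distort(x, h, j):
    """One-dimensional analog of Eq. (56b) with t = 0 and G = 1."""
    return x + h * x ** 2 + j * x ** 3

def closure_defect(x, first, second):
    """Mismatch between composing two distortions and adding their coefficients."""
    (h1, j1), (h2, j2) = first, second
    composed = distort(distort(x, h1, j1), h2, j2)
    summed = distort(x, h1 + h2, j1 + j2)
    return abs(composed - summed)

# Hypothetical small coefficients, then the same coefficients halved.
full = closure_defect(1.0, (0.02, 0.01), (0.03, 0.02))
half = closure_defect(1.0, (0.01, 0.005), (0.015, 0.01))
ratio = full / half
```

The value of `full` is much smaller than the coefficients themselves, and `ratio` is close to four, consistent with a mismatch that starts at second order in the coefficients.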

D. Normalization of an Image on a Sphere. As a final application of the methods of Sec. 2, 
we briefly discuss the normalization of an image defined on the surface of a sphere of 
radius R and angular variables $\Omega$, with respect to the group of rotations S of the sphere. We 
shall confine ourselves to the simplest case, in which the image I has no special symmetries 
which make the relevant constraint integrals vanish. The normalization can be carried out 
in two steps. The first step imposes the constraint 

$$\int d\Omega\, I_S(\Omega)\, \hat{n}_\Omega \propto \hat{x}_3 , \tag{57}$$

with $\hat{n}_\Omega$ the outward pointing three dimensional unit normal to the sphere at $\Omega$. This 
rotates the sphere so that the positive $x_3$ axis (the north polar axis of the sphere) passes 
through the center of mass (calculated in spherical geometry) of the image. The second 
step consists of a rotational normalization with respect to azimuth (or longitude) using 
the formulas of Subsec. 3C, in which dependences on $|x|$ are replaced by dependences on 
the spherical polar angle (or latitude). In the group contraction limit in which the sphere 
radius R approaches infinity while the dimension of the region of support of the image 
remains bounded, this normalization recipe reduces to that of Subsecs. 3A, C for combined 
translational and rotational normalization of a planar image. 
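
The first of the two steps can be sketched as follows, assuming the image on the unit sphere is sampled as weighted unit normals; the sampling, the sample data, and the function names are hypothetical. The rotation used is the shortest rotation carrying the center of mass direction onto the north polar axis, one convenient choice among the rotations that satisfy the constraint; the remaining azimuthal freedom is what the second step fixes.

```python
import math

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1], u[2] * v[0] - u[0] * v[2], u[0] * v[1] - u[1] * v[0])

def center_of_mass(samples):
    """Weighted sum of unit normals; samples are (normal, weight) pairs."""
    return tuple(sum(w * n[k] for n, w in samples) for k in range(3))

def rotate_to_pole(samples):
    """Rotate all normals so that the center of mass lies on the positive x3 axis."""
    c = center_of_mass(samples)
    length = math.sqrt(sum(ck * ck for ck in c))
    if length == 0.0:
        raise ValueError("center of mass vanishes; this constraint cannot fix the axis")
    a = tuple(ck / length for ck in c)
    if a[2] < -1.0 + 1e-12:
        # Center of mass at the south pole: rotate by pi about the x1 axis.
        return [((n[0], -n[1], -n[2]), w) for n, w in samples]
    v = cross(a, (0.0, 0.0, 1.0))            # rotation axis scaled by the sine of the angle
    out = []
    for n, w in samples:
        vn = cross(v, n)
        vvn = cross(v, vn)
        # Rodrigues form: R n = n + v x n + v x (v x n) / (1 + cos(angle))
        rotated = tuple(n[k] + vn[k] + vvn[k] / (1.0 + a[2]) for k in range(3))
        out.append((rotated, w))
    return out

def unit(theta, phi):
    """Unit normal at polar angle theta and azimuth phi."""
    return (math.sin(theta) * math.cos(phi), math.sin(theta) * math.sin(phi), math.cos(theta))

samples = [(unit(0.4, 0.3), 1.0), (unit(1.1, 2.0), 2.0),
           (unit(2.0, -1.0), 0.5), (unit(0.9, 4.0), 1.5)]
normalized = rotate_to_pole(samples)
cm = center_of_mass(normalized)
```

After the rotation the first two components of `cm` vanish and the third is positive, so the north polar axis passes through the center of mass.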



5. Summary and Discussion 

We have given a general normalization method for viewing transformations of planar 
images, based on imposing a set of constraints equal in number to the parameters of the viewing 
transformation group, the solution of which gives the parameters of the normalizing 
transformation. In Sec. 3 we discussed linear viewing transformations, for which algebraically 
solvable constraints can be given. In Sec. 4 we discussed more complex situations, in which 
some of the constraints cannot be solved algebraically, but can be solved by numerical 
iterative methods. Although the normalization methods of Subsecs. 3A-F and 4A-B were 
all based on the use of moments or other weighted integrals of the image to construct the 
constraints, the general analysis of Sec. 2 does not require this. Alternative methods include 
setting the scale normalization by a determination of the outer boundary of the image (as in 
[10], [16]), and setting the rotational normalization angle after scale normalization by using 
the maximum of I in an annular ring of given radius, both of which are methods that use 
local image features instead of weighted integrals over the image. 

The most convenient set of constraints will, in practice, depend on the specifics of the 
invariance problem being analyzed. In general, the larger the number of constraints that 
can be solved algebraically, and the smaller the number that require numerical solution, the 
more computationally efficient will be the resulting normalization method. For this reason, 
we have given particular emphasis to subgroup methods, that express some of the constraints 
needed for more complex normalization problems in terms of those already constructed for 
simpler normalization problems, for which algebraic solution methods are available. 

In conclusion, we emphasize that according to the general theory established in Sec. 
2 and illustrated in Secs. 3 and 4, any set of constraints that uniquely breaks the viewing 
transformation group invariance suffices to construct a normalization, and thereby to yield 
all viewing transformation invariants. The difference between normalizations constructed 
using alternative sets of constraints will always be representable by a residual mapping of 
the image, depending on the image only through viewing transformation invariants. 